allow BeautifulSoup configuration kwargs to be specified #224

Merged

chrispy-snps
Collaborator

Fixes #223 by extending #206 to also accept a full set of BeautifulSoup keyword arguments (kwargs).

If a simple string or list is passed, the original behavior from #206 is used to treat it as a features specification:

md = MarkdownConverter(bs4_options="html.parser")
md = MarkdownConverter(bs4_options=["lxml", "html.parser"])

But if a dictionary is passed, then it is treated as a full kwargs specification:

md = MarkdownConverter(bs4_options={"exclude_encodings": ["iso-8859-7"]})
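The dispatch described above can be sketched in a few lines. This is only an illustration of the idea, not the code merged in this PR: the helper name normalize_bs4_options and the Bs4Options alias are hypothetical, and the sketch assumes the resulting dictionary is later unpacked into the BeautifulSoup constructor.

```python
from typing import Any, Dict, List, Union

# Hypothetical alias for the three accepted shapes of bs4_options.
Bs4Options = Union[str, List[str], Dict[str, Any]]


def normalize_bs4_options(bs4_options: Bs4Options) -> Dict[str, Any]:
    """Return keyword arguments intended for BeautifulSoup(markup, **kwargs).

    Hypothetical helper: the name and placement are assumptions made
    for illustration, not taken from the markdownify source.
    """
    if isinstance(bs4_options, dict):
        # A dictionary is treated as a full kwargs specification.
        # Copy it so the caller's dictionary is never mutated.
        return dict(bs4_options)
    # A string or list is shorthand for the `features` argument.
    return {"features": bs4_options}


print(normalize_bs4_options("html.parser"))
print(normalize_bs4_options(["lxml", "html.parser"]))
print(normalize_bs4_options({"exclude_encodings": ["iso-8859-7"]}))
```

Normalizing to a dictionary up front would keep the call site uniform, since a single `BeautifulSoup(html, **kwargs)` call can then serve all three input shapes.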

@chrispy-snps chrispy-snps force-pushed the chrispy/support-bs4-options branch 2 times, most recently from 2d0a14a to 07d0500 Compare June 14, 2025 11:26
Collaborator

@AlexVonB AlexVonB left a comment

Good solution! Since the option name changes, you should at least make this a minor release.

How do you handle the command line options? Could we just... ignore this option for the CLI? :D

@chrispy-snps
Collaborator Author

@AlexVonB - ack! Yes, I need to update the command-line script. I say we just accept only a string value for features and call it good. If you're doing anything more complex than that, it seems reasonable to do it from code.
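A string-only command-line option along those lines might look like the sketch below. The flag name --bs4-features, the dest name, and the html.parser default are assumptions made for illustration; the actual flag in the command-line script may differ.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Build a minimal parser exposing only a string `features` value."""
    parser = argparse.ArgumentParser(
        description="Convert HTML to Markdown (illustrative sketch)."
    )
    # Hypothetical flag: accepts a single string, which would be passed
    # through as the `features` specification. Full kwargs dictionaries
    # are deliberately not supported from the command line.
    parser.add_argument(
        "--bs4-features",
        dest="bs4_options",
        default="html.parser",  # assumed default, not confirmed by the PR
        help="Beautiful Soup parser features, e.g. 'lxml' or 'html.parser'.",
    )
    return parser


# Parse an explicit argument list so the sketch runs without a real shell.
args = build_parser().parse_args(["--bs4-features", "lxml"])
print(args.bs4_options)
```

Because the parsed value is a plain string, it could be handed to the converter unchanged and would follow the string branch of the behavior described in the PR description.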

Signed-off-by: chrispy <chrispy@synopsys.com>
@chrispy-snps chrispy-snps force-pushed the chrispy/support-bs4-options branch from 07d0500 to e9d3de8 Compare June 14, 2025 13:04
@chrispy-snps chrispy-snps merged commit 75ab306 into matthewwithanm:develop Jun 14, 2025
1 check passed
Successfully merging this pull request may close these issues.

Add generalized support for specifying Beautiful Soup options